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Summary 

A nonparametric maximum likelihood procedure is given for estimating the survivor function from 
right-censored data. It approximates the hazard rate by a simple function such as a spline, with 
different approximations yielding different estimators. A special case is that proposed by Nelson 
(1969, Journal of Quality Technology 1, 27-52) and Altshuler (1970, Mathematical Biosciences 6, 
1-11). The estimators are uniformly consistent and have the same asymptotic weak convergence 
properties as the Kaplan-Meier (1958, Journal of the American Statistical Association 53, 457-481) 
estimator. However, in small and in heavily censored samples, the simplest spline estimators have 
uniformly smaller mean squared error than do the Kaplan-Meier and Nelson-Altshuler estimators. 
The procedure is extended to estimate the baseline hazard rate and regression coefficients in the Cox 
(1972, Journal of the Royal Statistical Society, Series B 34, 187-220) proportional hazards model 
and is illustrated using experimental carcinogenesis data. 



1. Introduction 

We describe a nonparametric procedure for estimating the probability F{x) of surviving to 
time X, using a random sample Xi , . . . , X„ of death times. The X/ are censored on the right 
by random variables C, so that one observes only min(X/, C), / = 1, . . . , The C/'s are 
a random sample, drawn independently of the X/, from a distribution G{c) = Pr(C > c). 
We let yi ^ • • • =^ yn denote the ordered observations, and let 6/ = I(X(/) ^ Q/)) be an 
indicator for the event that y/ is uncensored. 

Kaplan and Meier (1958), and Nelson (1969) and Altshuler (1970) developed the 
nonparametric estimators Fkm and Fna, defined by 

^km(o = n , 

y.^t \/1 — / + 1/ 

Fna(0 = exp[-ANA(OL Ana(0= I (1.1) 
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Ana(0 estimates the cumulative hazard k{t) = -In F{t). When the number n - i of items 
on test at time t is large, Fkm(0 ^i^d FnaU) are in close agreement. However, they share 
three disadvantages. First, they do not provide a useful estimate of the hazard rate 
X{t) = —(d/dt) In F(t), which is often of central interest. Second, they do not use the precise 
censoring times. Third, their efficiency in large samples is poor. Miller (1983) found 
asymptotic efficiency as low as 50% for the Kaplan-Meier estimator relative to certain 
parametric estimators. 

We present a class of estimators for F that provide estimates for X and use the precise 
censoring times. They are obtained in Section 2 by maximizing the log-likelihood function 
with X represented by a spline or other simple function. They have the same asymptotic 
weak convergence properties as the Kaplan-Meier and Nelson-Altshuler estimators, so 
they provide no gain in asymptotic efficiency. However, in Section 3 we show that they are 
considerably more efficient in small and moderate samples, and their efficiency relative to 
the Kaplan-Meier and Nelson-Altshuler estimators increases as the censoring becomes 
heavier. Section 4 extends the method to estimate the baseline hazard rate and regression 
coefficients in the Cox (1972) proportional hazards model. An example is presented in 
Section 5. 



2. Theory 

We wish to use the likelihood of the data (y,, 6/), / = 1, . . . , /i, to estimate F over an 
interval [0, T]. The log-likelihood is a sum of a term L involving F and a term involving 
G (Kalbfleisch and Prentice, 1980). To maximize it with respect to F we express L as a 
functional of the hazard rate X: 



L(X) = I 



6/log Xiyi) - j X{u) du 
Jo 



(2.1) 



We shall approximate X in (2.1) by a function X^(0 obtained by introducing the knots 
0 = to< t\ < • • • <tk< tk+i = T, and defining 

Xe(0 = I Oj{t)Xj. (2.2) 

7=0 

Here Xj = X{tj) and Sj is a known function of t. The integral of Xq, which approximates the 
cumulative hazard A(/), is 

k+x n k+l 
Xq{u) du= ^ Oj{u) du^ Y oLj{t)Xj, (2.3) 

The coefficients aj(t) are known functions which do not depend on X. 

We take the tj, j = U . . . , k, to be the k distinct failure times. Then in (2.1) we replace 
the integral of X by that of X^ and use (2.3) to obtain 

k+\ n 

L(^q) = 1 (djlog Xj - jjXjl where yj = S ajiyi). (2.4) 

j=o /■= 1 

Here dj denotes the number of failures at time tj, j = I, . . . , k, do = dk+i = 0. Evidently 
(2.4) is the log-likelihood kernel for A: H- 2 independent Poisson variates dj with means jjXj, 
7 = 0, 1, ...,/: H- 1. The solution to its likelihood equations is 

h = djy7'. (2.5) 
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When these values \j are used in (2.2) and (2.3), they yield the estimates 

M0 = 'S Oj{t)djyj\ (2.6) 

7=0 

Ae(0 = i aj{t) djy-\ Fait) = exp[-Ae(0]. (2.7) 

7=1 

Section 3 describes the large- and small-sample properties that justify these estimators. Now 
we present some special cases, corresponding to particular choices of Oj. 

When Sj{t) = b{t - tj\ where 6(-) is the Dirac delta function, aj{t) becomes the indicator 
aj{t) = l{t ^ tj), 7y becomes the number A^; of observations at risk at time tj, and (2.7) yields 

FQ^it) = exp^-- dj/N)j. (2.8) 

Comparison of (2.8) with (1.1) shows that Fq^ is the Nelson-Altshuler estimate Fna in the 
absence of tied failure times, but differs slightly from it in the presence of ties. Breslow 
(1972) derived Fq^ as the maximum likelihood estimate of F in the class of distributions 
which have constant hazard rates between uncensored observations, with censored obser- 
vations assumed censored just after the preceding failure time. 

The hazard rate estimate \qJJ:) in (2.6) is a sum of delta functions. By using functions 
dj{t) derived from simple quadrature rules, one can obtain smoother hazard rate estimates, 
as the following examples illustrate: 

(i) Left rectangular rule Qx : Set (?o(0 = 0 and djit) = l{tj-i < t ^ tj); j = \, . . . , k -\- \. 
Then Xq in (2.2) is the left-continuous step function Xq^U) = Xy+i, tj < t ^ tj+i, 
7 = 0,...,/: (See Figure I A), By using Oj{t) in (2.3) we find that ajit) = aj{t\ where 

ro, 0 ^ / ^ tj-A 

aj(t) = y - tj-u tj-i ^t^tjV (2.9) 

[tj-tj-i^Aj, tj^t^T jj=i,...Mi 

is the "time on test" during the interval [tj-i, tj] of an item observed until t, and ao{t) = 0. 
By using aj = aj in (2.4), one finds that yj is the observed total time on test during 
the interval [tj-\, tj]. The hazard rate estimate Xq^ is also a left-continuous step function, 
with "steps" of height dj/yj. 

(ii) Right rectangular rule Qi, Let dj{t) = l{tj < t ^ tj+i), j = 0, I, . . . , k, with 
0k+i(t) = 0. Then from (2.2), Xq is the right-continuous step function XQ^(t) = Xj, 
tj ^ t < tj+i, j = 0, , , , , k (Figure IB). As in example (i), we find that aj{t) - aj+i{t), 
7 = 0,...,/:, where ak+2 = 0. Here yj is the total time on test during the interval [tj, tj+i]. 

(iii) Trapezoidal rule Q3: Set 

Ojit) = litj-, < t ^ tj)aj{t)/Aj + litj<t ^ /;+,)[V, - aJMt)]/^^u j = U . . . , k, (2.10a) 

Oo{t) = l(to <t<tx )[Ai - a, (0]/Ai , Ok^,{t) = \{tk < t < tk^i)ak^i{t)/Ak^i, (2. 10b) 

Here A; = tj - tj-i is the length of the 7th time interval. The resulting spline function Xq in 
(2.2) is the continuous piecewise linear function 

^(23(0 = — - Vi + 

^7+ 1 ^7+ 1 



This content downloaded from 128.95.155.210 on Tue, 8 Jul 2014 17:59:25 PM 
All use subject to JSTOR Terms and Conditions 



498 



Biometrics, September 1986 





I \ I \ \ I I I 

0 U h h U \ T 

Figure 1. The functions X(/)( ) and \Q{t){ ) given by: {A) the left rectangular rule Qi\ 

(B) the right rectangular rule Q2; and (C) the trapezoidal rule Q3. 



tj^t^ tj+uj = 0, ...,k (Figure IC). Moreover, (2.3) and (2.10) yield 

aj(t) = aj^,(t) + 5[A7^«7(0 - A-+\«^+,(0]. (2.11) 
When we use (2. 1 1) in (2.4) and (2.6) we obtain the result 

K = dj/yj = dj/[cij+i + ^{Ay^C2j - A-+\c2j+i)], j = U . . . , k. (2.12) 

Here Cmj is the sum of the mth powers of the times on test in the interval [tj-i, tj], defined 
by c„y = Sf=i a'fiyi). In particular, Cij is the total time on test in [tj-i, tj] and C2j is the sum 
of the squares of the times on test in [tj-i , tj]. 

With this notation, the estimates for the "step heights" in the left and right rectangular 
rules are Xj = dj/c\j and Xj = dj/cij+i, respectively. If the last item in the sample is 
uncensored, is zero, and therefore Xk is infinite for the right rectangular rule. The 
cumulative hazard estimates A(2,(0 and A^2(0 are closely related. When there are no tied 
failure times, (2.7) shows that 

\Q,it) - AQ^t) = ai,^,(t)cli^, - a,{t)cil, (2.13) 

Both Cu and , the total time on test in the first and last intervals, respectively, increase 
linearly with the number of items n almost surely, provided there is positive probability of 
surviving to time T. Thus, the difference (2.13) is of stochastic order \/n. 



This content downloaded from 128.95.155.210 on Tue, 8 Jul 2014 17:59:25 PM 
All use subject to JSTOR Terms and Conditions 



Survival Estimation Using Splines 



499 



3. Large- and Small-Sample Properties 

The estimators Fq. described in the preceding section are uniformly consistent and have 
the same asymptotic weak convergence properties derived for the Kaplan-Meier and 
Nelson-Altshuler estimators by Breslow and Crowley (1974), Aalen (1978), and Fleming 

Table 1 

Bias, relative efficiency, and root mean squared error of Kaplan-Meier (KM), trapezoidal rule 
(Q3), and smoothed Kaplan-Meier (SKM) estimators based on 500 samples of size n from 
F{t) = (1 + /)"• with average censoring proportion^ ttg. 
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^TTG = S/ S, €///(500/7), where €,/ = 1 if the /th observation in the /th sample is censored and €,/= 0 otherwise; 
/= 1,...,/?;/= 1,...,500. 
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and Harrington (1984). This is proved in unpublished work by Whittemore and Keller 
(Technical Report No. 69, Department of Statistics, Stanford University, 1983). Klotz 
(1982) also has introduced spline function estimators of \{t) using maximum likelihood. 
He has shown the consistency of the estimate based on his J?-spline of order 2, which is 
closely related to the trapezoidal rule estimate. 

We used Monte Carlo simulations to examine the small-sample bias, variance, and root 
mean squared error (rmse) of the Kaplan-Meier estimator Fkm and the estimators Fq., , 
/ = 0, 1,2, 3. We used the six survival functions F = e~\ e~^^, e~^\ (1 + t)~\ (1 + t^y\ 
and e~^ for / ^ In 2, 16^"^^ for / > In 2. They were chosen to represent hazard 
rates that are constant, linearly increasing, exponentially increasing, decreasing, U-shaped, 
and discontinuous. We assumed no censoring, moderate censoring {G = e~^\ and heavy 
censoring {G = e~^^), and sample sizes 10, 25, 50, and 100. For each combination of 
survival function F, censoring function G, and sample size, we generated 500 trials, i.e., 
sets of observations (yi , 6i ),..., (};„, 6„). There are a total of 6 x 3 x 4 = 72 combinations 
yielding 72 x 500 = 36,000 trials. 

For each trial we calculated Fkm(0 and FQ.{t) at each decile t of F. In calculating the 
trapezoidal rule estimator Fq^, we took tk+\ = T so large that A^+iC2,^+i was negligible in 
(2. 12). We omit detailed discussion of the left and right rectangular rule estimators Fq^ and 
Fq^, which tended to underestimate and overestimate F, respectively. Table 1 presents 
results only for F{t) = (1 + t)~\ since results were similar for the other distributions. For 
untied failure data when Fna = Fq^, Fleming and Harrington (1984) found that the rmse 
of Fna was slightly smaller than that of Fkm for t satisfying F(t) ^ .20, and sometimes 
substantially larger when F(t) < .20. We found the same thing. Since Fq^ never outper- 
formed Fq^ in RMSE, we have not included it in Table 1 . 

Table 1 illustrates the following general findings: (i) All estimators are fairly unbiased, 
except in the lower deciles of F and in very small samples (n = 10) with heavy censoring 
(average censoring proportion tt = .75). The small bias for Fkm agrees with the findings of 
Chen, Hollander, and Langberg (1982). (ii) The Kaplan-Meier estimator exhibits more 
bias than does the trapezoidal estimator Fq^ at the 10th and 20th percentiles of F, where 
the data are sparse, (iii) Fq^ has smaller variance than do either the Kaplan-Meier or the 
Nelson-Altshuler estimators, even in the absence of censoring, (iv) Fq^ has smaller rmse's 
than do the Kaplan-Meier and Nelson-Altshuler estimators, and the differences increase 
as the censoring becomes heavier. 

To check how much of the reductions in variance and rmse for Fq^ are due to smooth- 
ness rather than use of the precise censoring times, we also examined the small-sample 
behavior of a "smoothed" Kaplan-Meier estimator Fskm- This estimator is the piecewise 
linear curve obtained by connecting linearly the midpoints of successive "steps" of Fkm. 
The results in Table 1 show that smoothing substantially reduces the variance of the 
Kaplan-Meier estimator Fkm, and that Fq^ and Fskm exhibit virtually identical properties 
in the absence of censoring. However, Fq^ outperforms Fskm in censored samples, most 
markedly when the censoring is heavy, when the sample size is small, and in the lower 
deciles of F, 



4. Survival with Covariates 



Suppose each observation (y/, 6/, z/) includes a vector z/ of covariates that affect survival 
via the hazard rate model X(/; z) = X{t)e^^. Here /9 is a vector of unknown parameters. 
Then the log-likelihood (2.1) becomes 



i(A, /?)= 2 iWog 



Hyde 



Mi 



^0 



\{u) du 



(4.1) 
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Approximating the integral of X in (4.1) by (2.3) yields 



k+\ 



L = I [djlog Xj + 0sj - Tjmj] with rX/9) = I e^^^ajiyd. 

J=0 i=l 

The value s, is the sum of z over the dj items failing at tj, and So = Sk+i = 0. 
From (4.2), L = log[AXi(j8)L2(X, ff)], where AT is independent offf and the X,, and 



501 



(4.2) 



LmL2{\. 



, i8) = n n 

U=i /=i 



/t+l 



[e''^/a,X3;,)/rX/9)] [^^ n [r,(/9)X,r>exp[-X,r,(/9)] \. (4.3) 



Here 7'/ indexes the dj failures occurring at tj. L2 is maximized by setting 
X, = XX/9) = dj/Tj(0l 7 = 0, 1, ...,/:+ 1. 



(4.4a) 



Since L2[X(/9), /9] is independent of /9, L is maximized by choosing 0 to maximize Li(/9). 
Thus, the hazard rate estimate X(/, z) is obtained by using (4.4a) for X; in (2.2) to estimate 
X^(/), and combining the result with e^^: 



k+l 



Ht, z) = e'^ I Oj{t) dj/Tj{0), 

j=0 



(4.4b) 



The factorization (4.3) shows that the second factor can be maximized to get an estimate 
of the baseline hazard rate X(/) no matter how /9 is estimated. Thus, for example, 0 obtained 
by maximizing Cox's partial likelihood could be used in (4.4b). This would yield a smooth 
estimate of X(/; z) with the advantages described in the preceding sections. 

The /9-likelihood equation corresponding to Li(/?) is 



k 
7=1 



s, - dj 2 

i=l 



= 0, 



(4.5a) 



where 



I /=i 

When is a solution of (4.5), then (4.4) and (2.3) yield the estimates 



(4.5b) 



7=1 



(4.6) 



The estimate 0 given by (4.5) for the rule Qo is consistent and asymptotically normal. 
This is so because then aj(t) is the indicator function l(t ^ tj) and in the absence of ties 
Li(/9) becomes Cox's (1972) partial likelihood. Since the probability of ties is zero, the 
results of Tsiatis (1981) and Bailey (1983) for Cox's partial likelihood apply. The fact that 
the number of parameters X; increases with the number of failures does not cause any 
difficulty in this case. In the presence of ties Li(/?) becomes Peto's (1972) approximation 
to Cox's likelihood. Further, Tj{0) of (4.2) is the sum of the terms e^^ over the risk set Rj 
consisting of those items surviving to time tj. From (4.6) the baseline survivor function 
estimate is 



Fq^U) = exp 



(4.7) 



This estimate, due to Breslow (1972), is discussed by Holford (1976). 
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For Qx, aj{t) = aj{t) is given by (2.9). The weight Wij in (4.5) is proportional to the 
"operational time" contributed by the ith item to the interval [tj-i, tj]. In the absence of 
censoring, (2.9) shows that aj{y,) = tj - tj-i for ^ tj and aj{yi) = 0 otherwise. Thus, the 
/9-likelihood equation (4.5) reduces to that obtained using Qo, so the estimate 0 is again 
consistent and asymptotically normal. In the presence of censoring, however, Qo and Qi 
produce different estimates. The left-continuous step function Xq^ is of the form proposed 
by Oakes ( 1 972), who used Qo to estimate /9. 

For the trapezoidal rule, we use (2.1 1) and (4.2) in (4.4) to get 

^Q,(tj) = dj/[Cuj^i + {{Ay' Cy - A7+\ C2,,>,)], 7 = 1, . . . , fc, (4.8) 

where Cnj = C,nj{0) = 1?=\ e^^'af{yi). According to (2.1 1) the weights Wij in (4.5b) vanish 
for items outside the risk sets Rj-\. The weight for the /th item depends on the first and 
second powers of the times it spends in the intervals [tj-\, tj] and [tj, tj+\]\ 

^ e'^'\aMyd^Aj'aJ{yd + ^n^aJ.^{yi)]] 

^ Ci j+ 1 + 5[ Ay ^ C27 — A j+ 1 C2J+ 1 ] 

The matrix of negative derivatives of the left-hand side of (4.6) has (^, i^)th entry 



k 

Aj) 



IM= 1 djv\',U ^, 1, ...,A 
7=1 



where 



n 






I 







(4.9) 



Here z^^^ denotes the ^th component of the p-dimensional vector z. is an estimate of 
the covariance matrix of 0 when 0 is asymptotically normal. In this case the null hypothesis 
/9 = 0 can be tested via the score statistic U(0), given by the left-hand side of (4.5a) evaluated 
at /9 = 0. Then U(0) is normal with zero mean vector and with covariance matrix 1(0), 
so one can refer the statistic U^(0)I(0)"^U(0) to a chi-squared distribution with p 
degrees of freedom. 

5. Example 

Data of Mays and Dougherty (1972) analyzed by Whittemore and McMillan (1982) will 
now be used to illustrate the preceding hazard rate and survivor function estimates. Table 
2 gives ordered values of time to censoring or failure (i.e., death with osteosarcoma) for 
beagle dogs injected with plutonium. The k = 5 uncensored values are denoted with 
asterisks, and doses are given in parentheses. Since there are no tied failure times among 
the data, d\ = - - - = = \ and do = de = 0. Therefore, under the null hypothesis that 
plutonium is unrelated to survival (i.e., = 0), the hazard rate estimate X^(/) of (2.6) 



Table 2 

Time in days to osteosarcoma (*) or censoring for 26 beagles injected with ^^^Pu. 
Doses in fiCi • 10 ^/kg are given in parentheses^. 

1539 (1.42) 2061 (1.68) 2257* (1.68) 2374 (1.67) 3111 (1.56) 3111 (1.59) 

3308* (1.63) 3367* (1.72) 3466 (1.51) 3482 (1.41) 3495 (1.59) 3495 (1.63) 

3495 (1.65) 3539 (1.39) 3539 (1.40) 3539 (1.41) 3649 (1.52) 3764 (1.68) 

3981 (1.67) 4292 (1.40) 4292* (1.65) 4549 (1.39) 4572* (1.50) 4810 (1.63) 

5160 (1.57) 5277 (1.53) 

From Whittemore and McMillan (1982). 
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becomes 

7=1 

Table 3 outlines the calculations of yj needed to compute X^(/). For Qo, 6j{t) = d{ti - tj) 
and yj is the number of items on test before the j\h failure, given in column 5. The 
hazard rate X^^ is a sum of delta functions with the yj^ as coefficients. For the left and 
right rectangular rules Qx and Q2, the Oj{t) are indicator functions and the yj are the total 
times on test Cxj and Cij+i, respectively, shown in column 3. (The table does not give the 
total time on test after the last failure, which is Ci,6 = 1531 days.) The resulting rectangular 
rule estimates and are step functions with the yj^ as steps. For the trapezoidal rule 
03, Oj{t) is given by (2.10). The quantities 7^, shown in column 6, are obtained from 



Table 3 



Quantities needed to compute hazard rate and score statistic for beagle data 


Failure 
time 

tj 


Interval 
length 

A, 


Time on test Cmj 




yj 


Dose 

^j 




(0)z,- 


li ^^^ij{0)z] 


Cxj • 10-^ 


cij • 10-^ 


Go 




Go 




Go 


23 


2257 
3308 
3367 
4292 
4572 


2257 
1051 
59 
925 
280 


51.11 
22.85 
1.12 
8.88 
1.38 


128.87 
23.56 
0.07 
6.76 
0.38 


24 
20 
19 
6 
4 


40,184.3 
11,770.9 
5,786.1 
4,355.1 
2,208.9 


1.68 
1.63 
1.72 
1.65 
1.50 


1.558 
1.544 
1.539 
1.545 
1.558 


1.556 
1.546 
1.538 
1.533 
1.551 


2.437 
2.396 
2.382 
2.395 
2.428 


2.432 
2.401 
2.376 
2.360 
2.408 



^(0) = Ij zj - Ij li = 0.437 (Qo); 0.456 (ft). 

m = Ij li MO)zl - Ij Hi wu{0)z,r = 0.046 (Go); 0.046 (G3). 



T 1 1 1 1 r 




1000 2000 3000 4000 5000 6000 



t (days) 

Figure 2. The Kaplan-Meier estimate Fkm(0 of the survivor function for the beagle data is shown 
by the solid line, and the trapezoidal rule estimate FQ^(t) is shown by the dashed line, with the scale 
at the left. The trapezoidal rule estimate XQ^{t) of the hazard rate is shown by the solid line and open 
circles, with the scale at the right. The horizontal axis denotes t in days. 
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columns 2-4 by using (2.12). The resulting piecewise linear hazard rate X^3(/) is plotted in 
Figure 2. It suggests that X(/) is proportional to a power of /, as is often true of carcinogenesis 
data. 

Figure 2 also shows that the survivor function estimator Fq^ exceeds the Kaplan-Meier 
estimator at each failure time tj. This difference reflects the use in Fq^ of time on test in 
the interval tj] by items censored in this interval. Use of this information decreases 
the estimated cumulative hazard, and thus increases the survivor function estimate. 
Therefore, in samples with heavy censoring, the spline function estimators for F will tend 
to exceed the Kaplan-Meier and Nelson-Altshuler estimators. 

Table 3 also outlines the calculations needed to compute the score statistic (7(0) and its 
variance estimate 7(0), using both the delta function rule Qo and the trapezoidal rule 03. 
Since the likelihood equation for Qo is Cox's partial likelihood equation, the score statistic 
for Qo is the one proposed by Cox. The normalized statistics U\G)/I{G) are 4.152 for Qo 
and 4.520 for 03, in close agreement with each other. When compared to the chi-squared 
distribution with 1 degree of freedom, both indicate significantly decreased survival with 
increased dose of plutonium. 

6. Discussion 

Two properties of the estimators Fq^, Fq^, and Fq^ account for their increased small-sample 
efficiency relative to the Kaplan-Meier and Nelson-Altshuler estimators. First, they use 
the exact survival times of the censored items. Second, they yield a continuous survivor 
function. One or both of these properties are shared by other estimators. These include the 
nonparametric Bayesian estimators proposed by Susarla and Van Ryzin (1976, 1978) and 
by Ferguson and Phadia (1979), and the monotone hazard rate estimators of Grenander 
(1956) and Marshall and Proschan (1965). Padgett and Wei (1980) and Mykytyn and 
Santner (1981) extended the latter to censored data and to distributions having a U-shaped 
hazard rate. They showed strong consistency of the estimators and used simulations to 
examine their small-sample properties. 

The increasing hazard rate and right rectangular rule estimators are closely related. Each 
provides a step function estimate of the hazard rate with jumps at the failure times. The 
two estimates agree exactly if the rectangular rule hazard rate estimate is increasing. The 
same close agreement holds for the decreasing hazard rate and left rectangular rule 
estimators. However, the monotone hazard rate estimators are inferior to the rectangular 
rule estimators in two ways. First, they impose monotonicity constraints on the hazard 
rate; and second, they exhibit large bias in small samples even when these constraints are 
satisfied (Mykytyn and Santner, 1981). Kalbfleisch and Prentice (1973) suggested using the 
rectangular rule hazard rates for estimation, with fixed knots. 

Anderson and Senthilselvan (1980) suggested maximizing a penalized log-likelihood 
function to obtain piecewise smooth estimates of the baseline hazard function in the 
proportional hazards model. Their procedure, which depends on an arbitrary smoothness 
parameter, estimates /9 by maximizing Cox's partial likelihood. 

Resume 

Ce papier presente une methode non parametrique du maximum de vraisemblance pour estimer une 
fonction de survie quand les donnees sent censurees a droite. Elle approxime le risque instantane par 
une fonction simple telle qu'une fonction spline, differentes approximations fournissant differentes 
estimations. Un cas special a ete propose par Nelson (1969, Journal of Quality Technology 1, 27-52) 
et Altshuler ( 1 970, Mathematical Biosciences 6, 1-11). Les estimations sont uniformement consistents 



This content downloaded from 128.95.155.210 on Tue, 8 Jul 2014 17:59:25 PM 
All use subject to JSTOR Terms and Conditions 



Survival Estimation Using Splines 



505 



et ont les memes proprietes de convergence faible que I'estimateur de Kaplan et Meier (1958, Journal 
of the American Statistical Association 53, 457-481). Cependant si les echantillons sont petits et 
fortement censures, les estimateurs spline les plus simples ont un ecart quadratique moyen uniforme- 
ment plus faible que les estimateurs de Kaplan-Meier et Nelson-Altshuler. La methode s'etend a 
I'estimation du risque instantane et des coefficients de regression dans le modele des risques instantanes 
proportionnels de Cox (1972, Journal of the Royal Statistical Society, Series B 34, 187-220); elle est 
illustree par un exemple de carcinogenese experimentale. 
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